Adaptive partitioning and scheduling method of convolutional neural network inference model on heterogeneous platforms
Shaofa SHANG, Lin JIANG, Yuancheng LI, Yun ZHU
Journal of Computer Applications, 2023, 43(9): 2828-2835. DOI: 10.11772/j.issn.1001-9081.2022081177
Abstract

To address the low hardware resource utilization and high latency of Convolutional Neural Network (CNN) inference on heterogeneous platforms, an adaptive partitioning and scheduling method for CNN inference models was proposed. Firstly, the key operators of the CNN were extracted by traversing the computational graph to partition the model adaptively, thereby increasing the flexibility of the scheduling strategy. Then, guided by performance measurements and a critical-path greedy search algorithm, and according to the running characteristics of each sub-model on the CPU-GPU heterogeneous platform, the optimal running load was selected for each sub-model to improve its inference speed. Finally, the cross-device scheduling mechanism of TVM (Tensor Virtual Machine) was used to configure the dependencies and running loads of the sub-models, achieving adaptive scheduling of model inference and reducing inter-device communication delay. Experimental results show that, with no loss of inference accuracy, the proposed method improves inference speed by 5.88% to 19.05% on GPU and by 45.45% to 311.46% on CPU compared with TVM operator-level optimization.
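
The critical-path greedy selection summarized in the abstract can be pictured with a minimal sketch. The sub-model names, per-device latencies, transfer penalty, and the greedy_schedule helper below are illustrative assumptions, not the authors' implementation or measured data; in the paper the resulting placement would then be realized through TVM's cross-device scheduling rather than this standalone script.

```python
# Illustrative sketch (not the paper's code): greedily assign each sub-model,
# visited in critical-path order, to the device with the lowest estimated cost,
# where cost = measured compute latency + a transfer penalty whenever the
# previous sub-model runs on a different device. All numbers are hypothetical.

from typing import Dict, List

DEVICES = ["cpu", "gpu"]
TRANSFER_MS = 0.8  # assumed per-boundary CPU<->GPU copy cost (hypothetical)

# Hypothetical measured latencies (ms) of each sub-model on each device.
measured: Dict[str, Dict[str, float]] = {
    "conv_block1": {"cpu": 6.4, "gpu": 1.2},
    "conv_block2": {"cpu": 9.1, "gpu": 1.5},
    "pooling":     {"cpu": 0.6, "gpu": 0.9},
    "fc_head":     {"cpu": 1.1, "gpu": 0.7},
}

def greedy_schedule(order: List[str]) -> Dict[str, str]:
    """Assign each sub-model to the device minimizing compute + transfer cost."""
    placement: Dict[str, str] = {}
    prev_dev = None
    for sub in order:
        best_dev, best_cost = None, float("inf")
        for dev in DEVICES:
            cost = measured[sub][dev]
            if prev_dev is not None and dev != prev_dev:
                cost += TRANSFER_MS  # penalize a cross-device hand-off
            if cost < best_cost:
                best_dev, best_cost = dev, cost
        placement[sub] = best_dev
        prev_dev = best_dev
    return placement

if __name__ == "__main__":
    # Sub-models listed along the critical path of the partitioned graph.
    print(greedy_schedule(["conv_block1", "conv_block2", "pooling", "fc_head"]))
```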
